Introduction
We covered one-way ANOVA, randomized block design, and multifactorial
design. As you may have already figured, we started with simple
experimental designs, and gradually move towards more complex
experimental designs.
This time, we will cover experimental designs involving repeated
measures. We will cover
- Why is repeated measure a problem for ANOVA?
- What is a random effect?
- How to set up a mixed effect model?
- How to analyze a mixed effect model?
Load packages
library(lme4)
library(lmerTest) #two new packages
library(emmeans)
library(multcomp)
library(multcompView)
library(ggplot2)
library(tidyr)
library(dplyr)
library(readr)
library(readxl)
library(RColorBrewer)
library(viridis)
Why is repeated measure
a problem for ANOVA
Repeated measures refer to the fact that in some experiments, the
same entity is measured multiple times. How could that be?
Let’s imagine an experiment like this:
Say I am tracking the heights of some rice plants over time. Every
week, I go to measure the height of those plants.
And THAT is a kind of repeated measure. The same plants are
measured multiple times throughout the experiment. Whenever the same
entity is measured multiple times (either across time or space), it is a
repeated measures experiment.
Let’s go over another example. Say I am interested in the number of
seeds per plant across two genotypes of rice. There are three pots per
genotype, and three plants per pot. (All three plants in a pot are the
same genotype.) Then I count the number of seeds in each plant.
And THAT is a repeated measure. The pots are being repeatedly
measured for the plants in them. And the pots cannot be used as a
blocking factor, because there is only one genotype per pot.
But why is that a problem? Remember, the assumption of ANOVA is that
each observation is independent of the other. And repeated measures is a
violation of that assumption. When multiple measurements of the same
entity were taken, those measurements will NOT be independent of each
other, because they are all affected by the same entity.
Repeated measures always reduce the number of independent
observations you have, and one must correct for repeated measures.
Working with repeated
measures - using random effects
To correct for repeated measures, we need to declare the entities
that are repeatedly measured as “random effects”.
This is a very abstract concept. So the best way to explain is to use
an example.
We are comparing the height of two rice genotypes, WT and mutant,
through time. Every week we measure the heights of those plants. (Data
from Sundaresan Lab, Department Of Plant Biology, UC Davis)
rice_growth <- read_csv("https://raw.githubusercontent.com/johnshorter/TeachingData/refs/heads/main/rice_growth_data.csv")
head(rice_growth)
In this experiment, each pot was randomly assigned a genotype, either
WT or mutant. The mutant is called “48/48”. There was one plant per pot.
The total of 48 pots were randomly placed into 6 flats, and 8 pots per
flat. So the flat was a blocking factor. We measure the height every
week, starting from week 3 to week 9, (but not week 8), so total of 6
time points.
Before we talk about random effects, we need to talk about fixed
effects. Simply put, the fixed effects are your treatments. In this
example, We have two fixed effects: genotype and age.
Let’s look at all the factors in this experiment
| time |
treat |
6 |
| genot |
treat |
2 |
| flat |
block |
6 |
| pot |
EU |
48 |
Because there are two independent variables (time and genotype) and
one blocking factor, this experiment will be called a randomized block
multifactorial design.
Note In this example, plant or pot is the experimental unit
for genotype, because the treatment of genotype was applied to each pot,
and there is only one plant in each pot. Note Pot is also the
entity that has been repeatedly measured through weeks.
You might ask, then where is the observational unit? The OU is actual
the plant at each week, or the plant:week combination.
Coming back to random effects. What is a random effect?
- A random effect CANNOT be a treatment (because they are
fixed effects);
- An EU that is not an OU must be declared as a random effect;
- An entity that is repeatedly measured must be declared as a random
effect;
- Blocking factors can be declared as random effects, but they don’t
have to.
So in this example, what needs to be declared as a random effect? The
pot needs to be declared as a random effect, since it is both the EU for
genotype and the entity of repeated measures.
Setting up a mixed
effect model
A mixed effect model refers to a model that contains both fixed
effects and random effects. A mixed effect model is appropriate for this
experiment, because we have two fixed effects (time and genotype) and
one random effect (pot). #### note, pot is the covariate within which we
have repeated measures
How do I set up the model? To set up the mixed effect model, we will
need a new function. Instead of lm(), we will use
lmer(). This will be the command for mixed effect linear
model.
To specify a random effect, we’ll use the
(1|random_effect) syntax. In this example, it will be
(1|pot)
In the model, you will put in your fixed effect and interactions
first, then you put in blocking factors, then you put in random effects.
So the model will be
lmer(height ~ time + genotype + time:genotype + flat + (1|pot))
You can also do time*genotype + flat + (1|pot). That gives
you the same thing.
Do not include the interactions with blocking factors (flat). The
blocking factor only exists to control for unintended variations. And we
are not interested in the flat:time or flat:genotype effects.
You might ask can I do
height ~ genotype * time + flat + (1|plant)? In an
experiment like this,
you should put the non-manipulative factor (time) first,
then you put the manipulative factor (genotype) second.
The reason is that in a mixed effect model, the first factor in the
model will account for variation first; then the rest of the variation
will be passed on to the second factor in the model, and so on.
By common sense, we know plants grow taller over time. So we expect
that time explains more variation than genotype. So we will put time
first in the model.
You can also declare the flat as a random effect by using
(1|flat). Either way is correct, but setting the flat as a
random effect is more conservative, meaning you’ll need a larger
difference in mean to detect a significant difference
Last thing: before you set up the model, is there any numeric
variables that you need to change to factors?
rice_growth <- rice_growth %>%
mutate(age.f = factor(age_weeks)) %>%
mutate(flat.f = factor(flat))
model_rice <- lmer(Height_mm ~ age.f * genotype + (1|flat.f) + (1|pot), data = rice_growth)
Data visualization
Before we do any analysis, let’s plot the data and look at it first.
Plot age.f on x axis, and Height_mm on y axis. Color by genotype.
Make the best plot you can here:
wisteria <- c("grey65", "burlywood3", "khaki2", "plum1", "lightcyan2", "cornflowerblue", "slateblue3")
rice_growth %>%
ggplot(aes(x = age_weeks, y = Height_mm)) +
stat_summary(geom = "line",
fun = mean,
aes(group = genotype, color = genotype),
size = 1.2) +
stat_summary(geom = "ribbon",
fun.data = mean_se,
aes(group = genotype,
fill = genotype),
alpha = 0.5) +
geom_point(aes(fill = genotype),
position = position_jitter(0.1, seed = 666),
alpha = 0.8,
size = 3,
shape = 21,
color = "black") +
scale_fill_manual(values = wisteria[c(1, 7)]) +
scale_color_manual(values = wisteria[c(1, 7)]) +
scale_x_continuous(breaks = c(3, 4, 5, 6, 7, 9)) +
labs(x = "Age (weeks)",
y = "Height mm",
fill = "genotype",
color = "genotype") +
guides(color = "none") +
guides(fill = guide_legend(nrow = 1, ncol = 2)) + # set fill legend to 1 row 2 columns.
theme_minimal() +
theme(legend.position = c(0.8, 0.15),
axis.line = element_line(size = 1.2),
text = element_text(size = 12, color = "black", face = "bold"),
axis.text = element_text(size = 12, color = "black", face = "bold")
)

We can also pull out the summary statistics. We’ll need to group by
genotype and age.f
rice_growth %>%
group_by(genotype, age.f) %>%
summarise(mean = mean(Height_mm),
var = var(Height_mm),
sd = sd(Height_mm),
n = n())
The variances seem to increase as the means increase, but I don’t
think it’s too bad (not orders of magnitude different). If you are
interested, you can try a log transformation and compare results. I
think the conclusions should practically be the same.
Stats
We have our model, now let’s check the assumptions of ANOVA
first.
Normality? It turns out the plot(model, which = 2)
command wouldn’t work on mixed effect models. We will use
qqnorm(resid(model)) instead. The resid()
function pull out all the residues in a model.
qqnorm(resid(model_rice))

Pretty normally distributed.
Equal variance across groups?
plot(model_rice)

It turns out when you call plot() on a mixed effect
model, it gives you the residue vs. mean plot. The residues seem to be
pretty evenly spread across the range of mean. The variances are more or
less even across groups, although there is a slightly larger spread at
the high end of the data.
So now we are all clear to do ANOVA.
ANOVA
anova(model_rice)
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
age.f 16398687 3279737 5 230.000 4578.6270 < 2.2e-16 ***
genotype 16523 16523 1 41.323 23.0673 2.078e-05 ***
age.f:genotype 18386 3677 5 230.000 5.1336 0.0001747 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Not surprisingly, age.f explain the most variation in the data (F =
4578)! Genotype also explains a significant amount of variation in the
dataset (F = 23 >> 1). The interaction between the treatments are
also significant (F = 5 > 1).
Tukey test
Let’s really not a question of whether or not the height is different
across age, because we know plants grow taller over time. The real
question is whether or not WT is taller or shorter than the mutant
across age.
So the emmeans layout should be
~ genotype | age.f
estimate_rice <- emmeans(model_rice, pairwise ~ genotype | age.f)
estimate_rice$contrast
age.f = 3:
contrast estimate SE df t.ratio p.value
(48/48) - WT -21.0 10.8 119 -1.949 0.0536
age.f = 4:
contrast estimate SE df t.ratio p.value
(48/48) - WT -31.6 10.8 119 -2.929 0.0041
age.f = 5:
contrast estimate SE df t.ratio p.value
(48/48) - WT -68.6 10.8 119 -6.368 <0.0001
age.f = 6:
contrast estimate SE df t.ratio p.value
(48/48) - WT -51.2 10.8 119 -4.755 <0.0001
age.f = 7:
contrast estimate SE df t.ratio p.value
(48/48) - WT -31.5 10.8 119 -2.921 0.0042
age.f = 9:
contrast estimate SE df t.ratio p.value
(48/48) - WT -28.3 10.8 119 -2.629 0.0097
Degrees-of-freedom method: satterthwaite
You can also look at cld if you like.
cld(estimate_rice$emmeans, Letters = letters)
age.f = 3:
genotype emmean SE df lower.CL upper.CL .group
48/48 92 14.0 9.18 60.3 124 a
WT 113 13.3 7.49 81.9 144 a
age.f = 4:
genotype emmean SE df lower.CL upper.CL .group
48/48 200 14.0 9.18 167.9 231 a
WT 231 13.3 7.49 200.0 262 b
age.f = 5:
genotype emmean SE df lower.CL upper.CL .group
48/48 328 14.0 9.18 296.2 359 a
WT 396 13.3 7.49 365.3 428 b
age.f = 6:
genotype emmean SE df lower.CL upper.CL .group
48/48 498 14.0 9.18 466.0 529 a
WT 549 13.3 7.49 517.8 580 b
age.f = 7:
genotype emmean SE df lower.CL upper.CL .group
48/48 646 14.0 9.18 614.2 677 a
WT 677 13.3 7.49 646.2 708 b
age.f = 9:
genotype emmean SE df lower.CL upper.CL .group
48/48 779 14.0 9.18 747.7 811 a
WT 808 13.3 7.49 776.5 839 b
Degrees-of-freedom method: satterthwaite
Confidence level used: 0.95
significance level used: alpha = 0.05
NOTE: If two or more means share the same grouping symbol,
then we cannot show them to be different.
But we also did not show them to be the same.
What does this tell us? It tells us that in each time point (except
the first), WT plants are on average significantly taller than 48/48
plants.
(Note: The differences in heights at week 9 is 28.3 mm. In reality,
~30 mm (3cm) difference in height is practically nothing for rice
plants. Even though something is statistically significant, it does not
always mean something practically different.)
Exercise
Now you have learned how to deal with repeated measures in an
experiment, it’s time for you to practice.
We will use a mint growth experiment as an example. In this
experiment, the researchers looked at the effect of 6 treatments (A - F)
on mint growth. Each treatment has 3 pots. Each pot has 4 plants. The
treatment was applied to the pot, not the individual plants. The growth
of each plant in each pot was measured. (Data from UC Davis Plant
Sciences course PLS205, Winter 2017)
mint_data <- read_csv("https://raw.githubusercontent.com/johnshorter/TeachingData/refs/heads/main/Mint.csv")
head(mint_data)
tail(mint_data)
There is a problem with how the data were recorded. Maybe it’s not
obvious to you. In each treatment, the name of the Pot is all Pot_1,
Pot_2 and Pot_3. However, in the actual experiment, Pot_1 from Treatment
A is not the same Pot_1 for Treatment B, or CDEF. The same applies to
Pot_2 and Pot_3. Simply put, the pots in each treatment need to have
their own unique identifier. Let’s fix that first.
mint_data <- mint_data %>%
mutate(Pot.ID = paste(Pot, Trtmt))
head(mint_data)
mint_data %>%
group_by(Trtmt) %>%
summarise(mean = mean(Growth),
var = var(Growth),
sd = sd(Growth),
n = n())
Now each pot has their own unique ID. Pot is the EU that has multiple
measurements.
Visualization
Make the best plot you can to visualize the data.
What to put on x axis? What to color/fill with? Any faceting that you
want to use?
Make your plot here:
mint_data %>%
mutate(Plant = factor(Plant)) %>%
ggplot(aes(x = Trtmt, y = Growth)) +
geom_point(aes(fill = Plant),
position = position_jitter(0.1, seed = 666),
alpha = 0.8,
size = 3,
shape = 21,
color = "black") +
scale_fill_manual(values = wisteria[c(1,2,3, 7)]) +
scale_color_manual(values = wisteria[c(1,2,3, 7)]) +
#scale_x_continuous(breaks = c(3, 4, 5, 6, 7, 9)) +
labs(x = "Treatment",
y = "Growth",
fill = "Plant",
color = "Plant") +
guides(color = "none") +
#guides(fill = guide_legend(nrow = 1, ncol = 2)) + # set fill legend to 1 row 2 columns.
theme_minimal() +
theme(legend.position = "bottom",
axis.line = element_line(size = 1.2),
text = element_text(size = 12, color = "black", face = "bold"),
axis.text = element_text(size = 12, color = "black", face = "bold")
)

What is being
repeatedly measured in this experiment?
Can you tell which entity was repeatably measured?
Fill in the table and write down the model. What are the sources of
variation? What factors are they? (treatment? block? observational
units? experimental units?) How many levels in each?
What is the experimental unit for treatment? What is the
observational unit?
| Trmt |
treat |
6 |
| Pot.ID |
EU |
18 |
| Pot.ID.Plant |
OU |
72 |
Technically plant could be a blocking factor if there was something
systematic about it e.g. coming from a certain seed container or
whatever, but in this we know nothing, it’s just a number, and doesn’t
seem to affect the treatments from visualizing the data.
Set up the linear model.
Check assumptions of
ANOVA
model_mint <- lmer(Growth ~ Trtmt + (1|Pot.ID), data = mint_data) # random intercept for Pot.ID
qqnorm(resid(model_rice))

plot(model_mint)

Normality?
Equal variance?
Run ANOVA
How would you interpret the ANOVA table?
anova(model_mint)
Type III Analysis of Variance Table with Satterthwaite's method
Sum Sq Mean Sq NumDF DenDF F value Pr(>F)
Trtmt 77.942 15.588 5 12 16.689 4.881e-05 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
F value significant, means of treatment groups are different.
Tukey test
Pull out the cld output and interpret the results.
estimate_mint <- emmeans(model_mint, pairwise ~ Trtmt)
estimate_mint$contrast
contrast estimate SE df t.ratio p.value
Trtmt_A - Trtmt_B -0.458 0.599 12 -0.765 0.9685
Trtmt_A - Trtmt_C -1.542 0.599 12 -2.574 0.1776
Trtmt_A - Trtmt_D -3.667 0.599 12 -6.121 0.0006
Trtmt_A - Trtmt_E -2.792 0.599 12 -4.661 0.0056
Trtmt_A - Trtmt_F -4.250 0.599 12 -7.095 0.0001
Trtmt_B - Trtmt_C -1.083 0.599 12 -1.809 0.4954
Trtmt_B - Trtmt_D -3.208 0.599 12 -5.356 0.0018
Trtmt_B - Trtmt_E -2.333 0.599 12 -3.895 0.0202
Trtmt_B - Trtmt_F -3.792 0.599 12 -6.330 0.0004
Trtmt_C - Trtmt_D -2.125 0.599 12 -3.548 0.0364
Trtmt_C - Trtmt_E -1.250 0.599 12 -2.087 0.3542
Trtmt_C - Trtmt_F -2.708 0.599 12 -4.521 0.0071
Trtmt_D - Trtmt_E 0.875 0.599 12 1.461 0.6929
Trtmt_D - Trtmt_F -0.583 0.599 12 -0.974 0.9178
Trtmt_E - Trtmt_F -1.458 0.599 12 -2.435 0.2187
Degrees-of-freedom method: kenward-roger
P value adjustment: tukey method for comparing a family of 6 estimates
cld(estimate_mint, Letters = letters)
Trtmt emmean SE df lower.CL upper.CL .group
Trtmt_A 3.67 0.424 12 2.74 4.59 a
Trtmt_B 4.12 0.424 12 3.20 5.05 a
Trtmt_C 5.21 0.424 12 4.29 6.13 ab
Trtmt_E 6.46 0.424 12 5.54 7.38 bc
Trtmt_D 7.33 0.424 12 6.41 8.26 c
Trtmt_F 7.92 0.424 12 6.99 8.84 c
Degrees-of-freedom method: kenward-roger
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 6 estimates
significance level used: alpha = 0.05
NOTE: If two or more means share the same grouping symbol,
then we cannot show them to be different.
But we also did not show them to be the same.
Treatment F has the highest estimate but is not significantly
different from E and D
What to submit to
Moodle (Pass/Fail)
For this lesson, complete the exercises in your shared
Exercise Pack (the file
Exercise_Pack_Template.Rmd). Then Knit to
HTML and upload the HTML file to the Moodle exercise
folder.
File name format:
Lastname_Firstname_L11.html
Satisfactory / Not
satisfactory (what I check)
Your submission is Satisfactory if:
- It knits (HTML opens and shows your work).
- You included outputs (plots/tables/model output)
that match the lesson.
- You answered the Self-check checklist (short,
honest, and complete).
- Your work is plausible (right method for the
lesson, correct variables, no obvious copy/paste without results).
If one or more of these are missing, it is Not
satisfactory (and you resubmit). ## Additional exercises
(complete these in your Exercise Pack)
Exercise 2 — Random
intercept model
Fit a mixed model with a random intercept for subject (or repeated
unit). Compare it to a model without the random effect.
avg_mint <- mint_data %>% split(mint_data$Pot.ID) %>% lapply(function(x){
avg.growth <- mean(x$Growth)
unique.vars <- x %>% dplyr::select(c(Trtmt, Pot.ID)) %>% unique()
unique.vars %<>% mutate(avg_growth = avg.growth)
return(unique.vars)
}) %>% bind_rows()
model_mint_avg <- lm(avg_growth ~ Trtmt, data = avg_mint)
plot(model_mint_avg)




anova(model_mint_avg)
Analysis of Variance Table
Response: avg_growth
Df Sum Sq Mean Sq F value Pr(>F)
Trtmt 5 44.911 8.9821 16.689 4.881e-05 ***
Residuals 12 6.458 0.5382
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
estimate_mint_avg <- emmeans(model_mint_avg, pairwise ~ Trtmt)
estimate_mint_avg$contrast
contrast estimate SE df t.ratio p.value
Trtmt_A - Trtmt_B -0.458 0.599 12 -0.765 0.9685
Trtmt_A - Trtmt_C -1.542 0.599 12 -2.574 0.1776
Trtmt_A - Trtmt_D -3.667 0.599 12 -6.121 0.0006
Trtmt_A - Trtmt_E -2.792 0.599 12 -4.661 0.0056
Trtmt_A - Trtmt_F -4.250 0.599 12 -7.095 0.0001
Trtmt_B - Trtmt_C -1.083 0.599 12 -1.809 0.4954
Trtmt_B - Trtmt_D -3.208 0.599 12 -5.356 0.0018
Trtmt_B - Trtmt_E -2.333 0.599 12 -3.895 0.0202
Trtmt_B - Trtmt_F -3.792 0.599 12 -6.330 0.0004
Trtmt_C - Trtmt_D -2.125 0.599 12 -3.548 0.0364
Trtmt_C - Trtmt_E -1.250 0.599 12 -2.087 0.3542
Trtmt_C - Trtmt_F -2.708 0.599 12 -4.521 0.0071
Trtmt_D - Trtmt_E 0.875 0.599 12 1.461 0.6929
Trtmt_D - Trtmt_F -0.583 0.599 12 -0.974 0.9178
Trtmt_E - Trtmt_F -1.458 0.599 12 -2.435 0.2187
P value adjustment: tukey method for comparing a family of 6 estimates
cld(estimate_mint_avg, Letters = letters)
Trtmt emmean SE df lower.CL upper.CL .group
Trtmt_A 3.67 0.424 12 2.74 4.59 a
Trtmt_B 4.12 0.424 12 3.20 5.05 a
Trtmt_C 5.21 0.424 12 4.29 6.13 ab
Trtmt_E 6.46 0.424 12 5.54 7.38 bc
Trtmt_D 7.33 0.424 12 6.41 8.26 c
Trtmt_F 7.92 0.424 12 6.99 8.84 c
Confidence level used: 0.95
P value adjustment: tukey method for comparing a family of 6 estimates
significance level used: alpha = 0.05
NOTE: If two or more means share the same grouping symbol,
then we cannot show them to be different.
But we also did not show them to be the same.
Interestingly we get the exact same results. It turns out that with a
balanced dataset the mixed model reduces to averaging by the repeated
measurement covariates.
Self-check: you show both models and a short
justification for the random effect.
Exercise 3 —
Trajectory plot
Make a plot showing repeated measurements over time (or condition)
per subject.
Self-check: each subject has multiple points;
patterns are visible.
Exercise 4 —
Assumptions and residuals
Check residual diagnostics for the mixed model and state one
limitation.
Self-check: at least one diagnostic plot + one
limitation statement.
Exercise 5 —
Mini-transfer
Apply a mixed model idea to a different dataset (built-in or from the
lesson) with a sensible random effect.
Self-check: you clearly identify what is the
repeated/clustered unit.
---
title: "11 - Repeated measures and mixed effect models"
author: "John Shorter"
date: "19/Dec/2026"
output:
  html_document:
    toc: yes  
  html_notebook:   
    number_sections: yes    
    toc: yes  
    toc_float: yes
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(magrittr)
```

# Introduction

We covered one-way ANOVA, randomized block design, and multifactorial design.
As you may have already figured, we started with simple experimental designs,
and gradually move towards more complex experimental designs.

This time, we will cover experimental designs involving repeated measures. We will cover

1.  Why is repeated measure a problem for ANOVA?
2.  What is a random effect?
3.  How to set up a mixed effect model?
4.  How to analyze a mixed effect model?

# Load packages

```{r}
library(lme4)
library(lmerTest)   #two new packages


library(emmeans)  
library(multcomp)
library(multcompView)
library(ggplot2) 
library(tidyr)
library(dplyr)
library(readr)
library(readxl)
library(RColorBrewer)
library(viridis)
```

# Why is repeated measure a problem for ANOVA

Repeated measures refer to the fact that in some experiments, the same entity is measured multiple times.
How could that be?

Let's imagine an experiment like this:

Say I am tracking the heights of some rice plants over time.
Every week, I go to measure the height of those plants.

And *THAT* is a kind of repeated measure. The same plants are measured multiple times throughout the experiment.
Whenever the same entity is measured multiple times (either across time or space),
it is a repeated measures experiment.

Let's go over another example.
Say I am interested in the number of seeds per plant across two genotypes of rice.
There are three pots per genotype, and three plants per pot. (All three plants in a pot are the same genotype.)
Then I count the number of seeds in each plant.\
And *THAT* is a repeated measure. The pots are being repeatedly measured for the plants in them.
And the pots cannot be used as a blocking factor, because there is only one genotype per pot.

But why is that a problem?
Remember, the assumption of ANOVA is that each observation is independent of the other.
And repeated measures is a violation of that assumption.
When multiple measurements of the same entity were taken,
those measurements will NOT be independent of each other,
because they are all affected by the same entity.

Repeated measures always reduce the number of independent observations you have,
and one must correct for repeated measures.

# Working with repeated measures - using random effects

To correct for repeated measures, we need to declare the entities that are repeatedly measured as "random effects".

This is a very abstract concept. So the best way to explain is to use an example.

We are comparing the height of two rice genotypes, WT and mutant, through time.
Every week we measure the heights of those plants.
(Data from Sundaresan Lab, Department Of Plant Biology, UC Davis)

```{r}
rice_growth <- read_csv("https://raw.githubusercontent.com/johnshorter/TeachingData/refs/heads/main/rice_growth_data.csv")
head(rice_growth)
```

In this experiment, each pot was randomly assigned a genotype, either WT or mutant. The mutant is called "48/48".
There was one plant per pot.
The total of 48 pots were randomly placed into 6 flats, and 8 pots per flat.
So the flat was a blocking factor.
We measure the height every week, starting from week 3 to week 9, (but not week 8), so total of 6 time points.

Before we talk about random effects, we need to talk about fixed effects.
Simply put, the fixed effects are your treatments.
In this example, We have two fixed effects: genotype and age.

Let's look at all the factors in this experiment

| sources | factor | levels |
|:-------:|:------:|:------:|
|  time   | treat  |   6    |
|  genot  | treat  |   2    |
|  flat   | block  |   6    |
|   pot   |  *EU*  |   48   |

Because there are two independent variables (time and genotype) and one blocking factor,
this experiment will be called a randomized block multifactorial design.

*Note* In this example, plant or pot is the experimental unit for genotype,
because the treatment of genotype was applied to each pot, and there is only one plant in each pot.
*Note* Pot is also the entity that has been repeatedly measured through weeks.

You might ask, then where is the observational unit?
The OU is actual the plant at each week, or the plant:week combination.

Coming back to random effects. What is a random effect?

1.  A random effect *CANNOT* be a treatment (because they are fixed effects);
2.  An EU that is not an OU must be declared as a random effect;
3.  An entity that is repeatedly measured must be declared as a random effect;
4.  Blocking factors can be declared as random effects, but they don't have to.

So in this example, what needs to be declared as a random effect?
The pot needs to be declared as a random effect,
since it is both the EU for genotype and the entity of repeated measures.

# Setting up a mixed effect model

A mixed effect model refers to a model that contains both fixed effects and random effects.
A mixed effect model is appropriate for this experiment,
because we have two fixed effects (time and genotype) and one random effect (pot).
#### note, pot is the covariate within which we have repeated measures

How do I set up the model?
To set up the mixed effect model, we will need a new function.
Instead of `lm()`, we will use `lmer()`. This will be the command for mixed effect linear model.

To specify a random effect, we'll use the `(1|random_effect)` syntax.
In this example, it will be `(1|pot)`

In the model, you will put in your fixed effect and interactions first,
then you put in blocking factors,
then you put in random effects.
So the model will be `lmer(height ~ time + genotype + time:genotype + flat + (1|pot))`
You can also do `time*genotype + flat + (1|pot).` That gives you the same thing.

Do not include the interactions with blocking factors (flat).
The blocking factor only exists to control for unintended variations.
And we are not interested in the flat:time or flat:genotype effects.

You might ask can I do `height ~ genotype * time + flat + (1|plant)`? In an experiment like this,\
you should put the non-manipulative factor (time) first,\
then you put the manipulative factor (genotype) second.

The reason is that in a mixed effect model, the first factor in the model will account for variation first;
then the rest of the variation will be passed on to the second factor in the model,
and so on.

By common sense, we know plants grow taller over time. So we expect that time explains more variation than genotype.
So we will put time first in the model.

You can also declare the flat as a random effect by using `(1|flat)`.
Either way is correct, but setting the flat as a random effect is more conservative,
meaning you'll need a larger difference in mean to detect a significant difference

Last thing: before you set up the model, is there any numeric variables that you need to change to factors?

```{r}
rice_growth <- rice_growth %>% 
  mutate(age.f = factor(age_weeks)) %>% 
  mutate(flat.f = factor(flat)) 
```

```{r}
model_rice <- lmer(Height_mm ~ age.f * genotype + (1|flat.f) + (1|pot), data = rice_growth)
```

# Data visualization

Before we do any analysis, let's plot the data and look at it first.
Plot age.f on x axis, and Height_mm on y axis. Color by genotype.

Make the best plot you can here:

```{r}
wisteria <- c("grey65", "burlywood3", "khaki2", "plum1", "lightcyan2", "cornflowerblue", "slateblue3")

rice_growth %>% 
  ggplot(aes(x = age_weeks, y = Height_mm)) +
  stat_summary(geom = "line",  
               fun = mean,  
               aes(group = genotype, color = genotype),  
               size = 1.2) + 
  stat_summary(geom = "ribbon",   
               fun.data = mean_se, 
               aes(group = genotype,
                   fill = genotype), 
               alpha = 0.5) +   
  geom_point(aes(fill = genotype), 
             position = position_jitter(0.1, seed = 666),    
             alpha = 0.8,             
             size = 3,
             shape = 21, 
             color = "black") +    
  scale_fill_manual(values = wisteria[c(1, 7)]) +  
  scale_color_manual(values = wisteria[c(1, 7)]) + 
  scale_x_continuous(breaks = c(3, 4, 5, 6, 7, 9)) +  
  labs(x = "Age (weeks)",              
       y = "Height mm",
       fill = "genotype",
       color = "genotype") +
  guides(color = "none") +
  guides(fill = guide_legend(nrow = 1, ncol = 2)) + # set fill legend to 1 row 2 columns. 
  theme_minimal() +
  theme(legend.position = c(0.8, 0.15),
        axis.line = element_line(size = 1.2), 
        text = element_text(size = 12, color = "black", face = "bold"),
        axis.text = element_text(size = 12, color = "black", face = "bold")
        )

```

We can also pull out the summary statistics.
We'll need to group by genotype and age.f

```{r}
rice_growth %>% 
  group_by(genotype, age.f) %>% 
  summarise(mean = mean(Height_mm),
            var = var(Height_mm),
            sd = sd(Height_mm),
            n = n()) 
```

The variances seem to increase as the means increase, but I don't think it's too bad (not orders of magnitude different).
If you are interested, you can try a log transformation and compare results. 
I think the conclusions should practically be the same.

# Stats

We have our model, now let's check the assumptions of ANOVA first.

Normality?
It turns out the `plot(model, which = 2)` command wouldn't work on mixed effect models.
We will use `qqnorm(resid(model))` instead.
The `resid()` function pull out all the residues in a model.

```{r}
qqnorm(resid(model_rice))
```

Pretty normally distributed.

Equal variance across groups?

```{r}
plot(model_rice)
```

It turns out when you call `plot()` on a mixed effect model, it gives you the residue vs. mean plot.
The residues seem to be pretty evenly spread across the range of mean.
The variances are more or less even across groups, although there is a slightly larger spread at the high end of the data.

So now we are all clear to do ANOVA.

## ANOVA

```{r}
anova(model_rice)
```

Not surprisingly, age.f explain the most variation in the data (F = 4578)!
Genotype also explains a significant amount of variation in the dataset (F = 23 >> 1).
The interaction between the treatments are also significant (F = 5 > 1).

## Tukey test

Let's really not a question of whether or not the height is different across age,
because we know plants grow taller over time.
The real question is whether or not WT is taller or shorter than the mutant across age.

So the `emmeans` layout should be `~ genotype | age.f`

```{r}
estimate_rice <- emmeans(model_rice, pairwise ~ genotype | age.f)
estimate_rice$contrast
```

You can also look at cld if you like.

```{r}
cld(estimate_rice$emmeans, Letters = letters)
```

What does this tell us?
It tells us that in each time point (except the first),
WT plants are on average significantly taller than 48/48 plants.

(Note: The differences in heights at week 9 is 28.3 mm.
In reality, ~30 mm (3cm) difference in height is practically nothing for rice plants.
Even though something is statistically significant, it does not always mean something practically different.)

# Exercise

Now you have learned how to deal with repeated measures in an experiment,
it's time for you to practice.

We will use a mint growth experiment as an example.
In this experiment, the researchers looked at the effect of 6 treatments (A - F) on mint growth.
Each treatment has 3 pots.
Each pot has 4 plants. The treatment was applied to the pot, not the individual plants.
The growth of each plant in each pot was measured.
(Data from UC Davis Plant Sciences course PLS205, Winter 2017)

```{r}
mint_data <- read_csv("https://raw.githubusercontent.com/johnshorter/TeachingData/refs/heads/main/Mint.csv")
head(mint_data)
tail(mint_data)

```

There is a problem with how the data were recorded. Maybe it's not obvious to you.
In each treatment, the name of the Pot is all Pot_1, Pot_2 and Pot_3.
However, in the actual experiment, Pot_1 from Treatment A is not the same Pot_1 for Treatment B, or CDEF.
The same applies to Pot_2 and Pot_3.
Simply put, the pots in each treatment need to have their own unique identifier.
Let's fix that first.

```{r}
mint_data <- mint_data %>% 
  mutate(Pot.ID = paste(Pot, Trtmt))

head(mint_data)
mint_data %>% 
  group_by(Trtmt) %>% 
  summarise(mean = mean(Growth),
            var = var(Growth),
            sd = sd(Growth),
            n = n()) 
```

Now each pot has their own unique ID.
Pot is the EU that has multiple measurements.

## Visualization

Make the best plot you can to visualize the data.

What to put on x axis? What to color/fill with? Any faceting that you want to use?

Make your plot here:

```{r}
mint_data %>% 
  mutate(Plant = factor(Plant)) %>% 
  ggplot(aes(x = Trtmt, y = Growth)) +
  geom_point(aes(fill = Plant), 
             position = position_jitter(0.1, seed = 666),    
             alpha = 0.8,             
             size = 3,
             shape = 21, 
             color = "black") +    
  scale_fill_manual(values = wisteria[c(1,2,3, 7)]) +  
  scale_color_manual(values = wisteria[c(1,2,3, 7)]) + 
  #scale_x_continuous(breaks = c(3, 4, 5, 6, 7, 9)) +  
  labs(x = "Treatment",              
       y = "Growth",
       fill = "Plant",
       color = "Plant") +
  guides(color = "none") +
  #guides(fill = guide_legend(nrow = 1, ncol = 2)) + # set fill legend to 1 row 2 columns. 
  theme_minimal() +
  theme(legend.position = "bottom",
        axis.line = element_line(size = 1.2), 
        text = element_text(size = 12, color = "black", face = "bold"),
        axis.text = element_text(size = 12, color = "black", face = "bold")
        )
```

## What is being repeatedly measured in this experiment?

Can you tell which entity was repeatably measured?

Fill in the table and write down the model.
What are the sources of variation?
What factors are they? (treatment? block? observational units? experimental units?)
How many levels in each?


What is the experimental unit for treatment?
What is the observational unit?

| sources | factor | levels |
|:-------:|:------:|:------:|
|  Trmt   | treat  |   6    |
| Pot.ID  |  *EU*  |   18   |
| Pot.ID.Plant | *OU* | 72 |

Technically plant could be a blocking factor if there was something systematic about it e.g. coming from a certain seed container or whatever, but in this we know nothing, it's just a number, and doesn't seem to affect the treatments from visualizing the data.

Set up the linear model.

## Check assumptions of ANOVA

```{r}
model_mint <- lmer(Growth ~ Trtmt + (1|Pot.ID), data = mint_data) # random intercept for Pot.ID
qqnorm(resid(model_rice))
plot(model_mint)
```

Normality?

Equal variance?

## Run ANOVA

How would you interpret the ANOVA table?

```{r}
anova(model_mint)
```
F value significant, means of treatment groups are different.

## Tukey test

Pull out the cld output and interpret the results.

```{r}
estimate_mint <- emmeans(model_mint, pairwise ~ Trtmt)
estimate_mint$contrast
```

```{r}
cld(estimate_mint, Letters = letters)
```

Treatment F has the highest estimate but is not significantly different from E and D

## What to submit to Moodle (Pass/Fail)

For this lesson, complete the exercises **in your shared Exercise Pack** (the file `Exercise_Pack_Template.Rmd`).
Then **Knit to HTML** and upload the HTML file to the Moodle exercise folder.

**File name format:** `Lastname_Firstname_L11.html`

### Satisfactory / Not satisfactory (what I check)

Your submission is **Satisfactory** if:

- It **knits** (HTML opens and shows your work).
- You included **outputs** (plots/tables/model output) that match the lesson.
- You answered the **Self-check checklist** (short, honest, and complete).
- Your work is **plausible** (right method for the lesson, correct variables, no obvious copy/paste without results).

If one or more of these are missing, it is **Not satisfactory** (and you resubmit).
## Additional exercises (complete these in your Exercise Pack)

### Exercise 2 — Random intercept model
Fit a mixed model with a random intercept for subject (or repeated unit).
Compare it to a model without the random effect.

```{r}
avg_mint <- mint_data %>% split(mint_data$Pot.ID) %>% lapply(function(x){
  avg.growth <- mean(x$Growth)
  unique.vars <- x %>% dplyr::select(c(Trtmt, Pot.ID)) %>% unique()
  unique.vars %<>% mutate(avg_growth = avg.growth)
  return(unique.vars)
}) %>% bind_rows()

model_mint_avg <- lm(avg_growth ~ Trtmt, data = avg_mint)
plot(model_mint_avg)
```


```{r}
anova(model_mint_avg)
```

```{r}
estimate_mint_avg <- emmeans(model_mint_avg, pairwise ~ Trtmt)
estimate_mint_avg$contrast
cld(estimate_mint_avg, Letters = letters)
```

Interestingly we get the exact same results. It turns out that with a balanced dataset the mixed model reduces to averaging by the repeated measurement covariates.

**Self-check:** you show both models and a short justification for the random effect.

### Exercise 3 — Trajectory plot
Make a plot showing repeated measurements over time (or condition) per subject.

**Self-check:** each subject has multiple points; patterns are visible.

### Exercise 4 — Assumptions and residuals
Check residual diagnostics for the mixed model and state one limitation.

**Self-check:** at least one diagnostic plot + one limitation statement.

### Exercise 5 — Mini-transfer
Apply a mixed model idea to a different dataset (built-in or from the lesson) with a sensible random effect.

**Self-check:** you clearly identify what is the repeated/clustered unit.
